ABSTRACT


We consider a process in which rewards are being earned
and for which there exist time points at which the process
begins anew. That is, we suppose that there exists an
embedded renewal process. An expression for the
asymptotic mean reward earned during any time interval is
then obtained. In the final section we consider the special
case of a regenerative reward process, and we present a
simple expression for the long run average reward earned
per unit time.


RENEWAL REWARD PROCESSES
by

Mark Brown and Sheldon M. Ross


INTRODUCTION


Let $X_1, X_2, \ldots$ be the interarrival times for a renewal process with
interarrival distribution $F$. Suppose that at the time of the $i$th renewal we
receive a reward $Y_i$. $Y_i$ may depend on $X_i$, but it is assumed that the
pairs $(X_i, Y_i)$, $i = 1, 2, \ldots$, are independent and identically distributed.

If we let

$$Y(t) = \sum_{i=1}^{N(t)} Y_i ,$$

where $N(t)$ is the number of renewals by time $t$, then $Y(t)$ represents the
total reward earned by time $t$. The stochastic process $\{Y(t), t \ge 0\}$ is called
a renewal reward process. In the first section of this paper, we will prove the
analogue of Blackwell's theorem for a renewal reward process. In Section 3 we
will consider processes of the form $Y(t) = \int_0^t V(s)\,ds$, where $V$ is a real-valued
regenerative process. An important result of Smith [5], p. 262, asserts under
mild conditions that $Y(t)/t$ converges a.s. and in expectation to $K_1/\mu_1$, where
$K_1$ is the expected value of the integral of $V$ over a regenerative cycle and
$\mu_1$ is the expected length of the regeneration cycle. Our result is that
$K_1/\mu_1 = EV(\infty)$, where $V(\infty)$ has the limiting distribution of $V(t)$.
1. BLACKWELL'S THEOREM FOR RENEWAL REWARD PROCESSES
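The following proposition is well known:


Proposition 1:

If either $EY_1$ or $EX_1$ is finite, then

(i) $$\lim_{t\to\infty} \frac{Y(t)}{t} = \frac{EY_1}{EX_1}$$

with probability 1, and

(ii) $$\lim_{t\to\infty} \frac{E[Y(t)]}{t} = \frac{EY_1}{EX_1} .$$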

A proof, based on a Tauberian theorem, is given by Johns and Miller [2]
and credited to Bell. The proposition also follows from a more general result
of Smith [5]. Part (ii) of the above is clearly the analogue of the elementary
renewal theorem. We shall now prove the analogue of Blackwell's theorem.
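
As a numerical illustration of Proposition 1, the following sketch simulates a renewal reward process and returns $Y(t)/t$ for a long horizon. The interarrival law (uniform on $(0,2)$), the reward rule $Y_i = X_i^2$, and the name `simulate_reward_rate` are assumptions chosen only for illustration; none of them comes from the paper.

```python
import random


def simulate_reward_rate(horizon, seed=0):
    """Return Y(horizon) / horizon for one simulated path.

    Illustrative assumptions (not from the paper):
    X_i ~ Uniform(0, 2), so EX_1 = 1, and Y_i = X_i**2, so EY_1 = 4/3.
    The reward Y_i is credited at the end of the i-th renewal interval.
    """
    rng = random.Random(seed)
    clock = 0.0          # time of the most recent renewal
    total_reward = 0.0   # Y(t): sum of rewards of completed intervals
    while True:
        x = rng.uniform(0.0, 2.0)
        if clock + x > horizon:
            # The interval in progress at the horizon earns nothing yet.
            break
        clock += x
        total_reward += x * x
    return total_reward / horizon


if __name__ == "__main__":
    print(simulate_reward_rate(20000.0, seed=1))
```

Under these assumptions $EY_1/EX_1 = 4/3$, so for a long horizon the returned value should be close to $4/3$, even though each reward depends on the length of its own interval.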


Theorem 1:

If $EY_1 < \infty$, $F$ is not lattice, and $EX_1Y_1 < \infty$, then

$$\lim_{t\to\infty} E[Y(t+h) - Y(t)] = h\,\frac{EY_1}{EX_1}$$

for all $h > 0$.


Proof:

Let $m(t) = E[N(t)]$. Now,

$$E[Y(t)] = E\Big[\sum_{i=1}^{N(t)+1} Y_i\Big] - E[Y_{N(t)+1}] = (m(t)+1)EY_1 - E[Y_{N(t)+1}], \tag{1}$$

where the last identity follows from Wald's equation. Hence,


$$E[Y(t+h) - Y(t)] = (m(t+h) - m(t))EY_1 - E[Y_{N(t+h)+1} - Y_{N(t)+1}],$$

and the result would follow from Blackwell's theorem if we can show that
$\lim_{t\to\infty} E[Y_{N(t)+1}]$ exists and is finite. Toward this end, let $g(t) = E[Y_{N(t)+1}]$.
Conditioning on $X_1$ gives

$$g(t) = \int_0^\infty E[Y_{N(t)+1} \mid X_1 = x]\,dF(x) = \int_t^\infty E[Y_1 \mid X_1 = x]\,dF(x) + \int_0^t g(t-x)\,dF(x).$$


This renewal type equation has the solution

$$g(t) = h(t) + \int_0^t h(t-x)\,dm(x), \tag{2}$$

where

$$h(t) = \int_t^\infty E[Y_1 \mid X_1 = x]\,dF(x).$$


Suppose now that all rewards are nonnegative; then by the key renewal theorem,
it follows that

$$\lim_{t\to\infty} g(t) = \frac{\int_0^\infty h(t)\,dt}{EX_1} = \frac{\int_0^\infty \int_t^\infty E[Y_1 \mid X_1 = x]\,dF(x)\,dt}{EX_1} = \frac{\int_0^\infty x\,E[Y_1 \mid X_1 = x]\,dF(x)}{EX_1} = \frac{E[X_1Y_1]}{EX_1},$$

where the interchange of integrals is justified by the nonnegativity of rewards.
In the general case, the result may be proven by breaking up the rewards into
their positive and negative parts and applying the above argument separately to
each.
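
The limit of $E[Y_{N(t)+1}]$ obtained in the proof is $E[X_1Y_1]/EX_1$ rather than $EY_1$, because the interval covering $t$ tends to be longer than a typical one. The sketch below estimates $E[Y_{N(t)+1}]$ by Monte Carlo. The uniform$(0,2)$ interarrival law, the rule $Y_i = X_i^2$, and the function name are illustrative assumptions, not part of the paper.

```python
import random


def mean_reward_of_covering_interval(t, replications, seed=0):
    """Monte Carlo estimate of E[Y_{N(t)+1}].

    Illustrative assumptions (not from the paper):
    X_i ~ Uniform(0, 2) and Y_i = X_i**2, so that
    E[X_1 Y_1] / EX_1 = E[X_1**3] / EX_1 = 2, while EY_1 = 4/3.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        clock = 0.0
        while True:
            x = rng.uniform(0.0, 2.0)
            clock += x
            if clock > t:
                # This is interval number N(t) + 1, the first one ending after t.
                total += x * x
                break
    return total / replications


if __name__ == "__main__":
    print(mean_reward_of_covering_interval(20.0, 20000, seed=1))
```

Under these assumptions the estimate should settle near $2$ for moderately large $t$, noticeably above $EY_1 = 4/3$.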


Remark 1:

The proof of Theorem 1 may also be used to prove (ii) of Proposition 1. This
is done in the following manner: Assume first that $EY_1 < \infty$. Then, from
Equation (1) and the elementary renewal theorem, it follows that (ii) holds if
$E[Y_{N(t)+1}]/t \to 0$. This, however, easily follows from (2), the assumption that $EY_1 < \infty$,
and the elementary renewal theorem. If $EY_1 = \infty$ (but $EX_1 < \infty$), then the result
follows by truncation.
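
One way to fill in the step $E[Y_{N(t)+1}]/t \to 0$ is sketched below. It assumes nonnegative rewards with $EY_1 < \infty$, and it assumes that (2) refers to the solution $g(t) = h(t) + \int_0^t h(t-x)\,dm(x)$ of the renewal type equation; the auxiliary constant $T$ is introduced here only for the argument.

```latex
% Assumptions: rewards nonnegative, EY_1 < \infty,
% g(t) = E[Y_{N(t)+1}] = h(t) + \int_0^t h(t-x)\,dm(x),
% h(t) = \int_t^\infty E[Y_1 \mid X_1 = x]\,dF(x).
\begin{align*}
0 \le h(t) &\le EY_1, \qquad h(t) \downarrow 0 \ \text{as } t \to \infty, \\
g(t) &= h(t) + \int_{[0,\,t-T]} h(t-x)\,dm(x) + \int_{(t-T,\,t]} h(t-x)\,dm(x) \\
     &\le EY_1 + h(T)\,m(t) + EY_1\,[\,m(t) - m(t-T)\,], \qquad t > T, \\
\limsup_{t\to\infty} \frac{g(t)}{t}
     &\le \frac{h(T)}{EX_1} + EY_1 \lim_{t\to\infty} \frac{m(t) - m(t-T)}{t}
      = \frac{h(T)}{EX_1}.
\end{align*}
```

The last limit vanishes because $m(t)/t$ and $m(t-T)/t$ both converge to $1/EX_1$ by the elementary renewal theorem. Letting $T \to \infty$ then gives $g(t)/t \to 0$.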

In the above, we have assumed that the rewards are earned at the end of the
renewal intervals. However, in many applications the rewards (or costs) are
earned gradually during the renewal intervals. For instance, in an inventory
model for which an (s,S) policy is employed, the costs are gradually incurred
during the renewal cycle. In order to generalize Theorem 1 to include this
possibility, let $W(s)$ denote the expected reward earned during the first $s$
time units of a renewal interval of length greater than $s$. Then the expected
reward earned by $t$, $E[Y(t)]$, will be given by

$$E[Y(t)] = E\Big[\sum_{i=1}^{N(t)} Y_i\Big] + E[W(Z(t))],$$

where

$$Z(t) = t - \sum_{i=1}^{N(t)} X_i$$

is the age of the renewal process at time $t$. Let


$$F_t(a) = \begin{cases} \int_{t-a}^{t} (1 - F(t-y))\,dm(y), & a < t, \\ 1, & a \ge t. \end{cases}$$

It is well known that $F_t$ is the distribution of $Z(t)$.

Following Smith [6], p. 11, define $G$ to be the class of all distributions
$F$ on $[0,\infty)$ having the property that for some $K$, the $K$th iterated convolution
of $F$ with itself has an absolutely continuous component.


Theorem 2:

(i) If $F$ is nonlattice, $EY_1 < \infty$, $EX_1 < \infty$, $EX_1Y_1 < \infty$, and if $W$
is continuous and uniformly integrable with respect to the family
$\{F_t, t \ge 0\}$, then

$$\lim_{t\to\infty} E[Y(t+h) - Y(t)] = h\,\frac{EY_1}{EX_1}. \tag{3}$$

(ii) Under the conditions of (i) but with $F \in G$, (3) holds iff $W$ is
uniformly integrable with respect to $\{F_t, t \ge 0\}$ ($W$ need not be
continuous).


Proof:

(i) It follows from Smith [5], p. 259, condition B, that $Z(t)$ converges
in distribution to a random variable with c.d.f.

$$F_e(a) = \frac{\int_0^a (1 - F(x))\,dx}{EX_1}.$$

Thus $E[W(Z(t))]$ converges to $\int_0^\infty W(x)\,dF_e(x)$ by the Helly-Bray theorem
(Loève [3], p. 183). Hence, the result follows from Theorem 1.

(ii) It follows from Smith [5], p. 259, condition C, that $\Pr(Z(t) \in A)$
converges for all Borel sets $A$ to $F_e(A) = \int_A (1 - F(x))\,dx / EX_1$.


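Let

$$W^a(x) = \begin{cases} W(x), & |W(x)| \le a, \\ 0, & |W(x)| > a. \end{cases}$$

Let $W^{a,\delta}$ be a simple function having the property that $\sup_x |W^a(x) - W^{a,\delta}(x)| < \delta$.
$W^{a,\delta}$ can be chosen by choosing $1/n < \delta$ and letting $W^{a,\delta}(x) = k/n$ for
$k/n \le W^a(x) < (k+1)/n$.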


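Note that the strong convergence in distribution
implies that $E_{F_t} W^{a,\delta} \to E_{F_e} W^{a,\delta}$ for all $a$, $\delta$. The result now follows by:

$$|E_{F_t} W - E_{F_e} W| \le |E_{F_e} W - E_{F_e} W^a| + |E_{F_e} W^a - E_{F_e} W^{a,\delta}| + |E_{F_e} W^{a,\delta} - E_{F_t} W^{a,\delta}| + |E_{F_t} W^{a,\delta} - E_{F_t} W^a| + |E_{F_t} W^a - E_{F_t} W| .$$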


Now the 1st and 5th terms on the right go to 0 uniformly in $t$ as $a \to \infty$, by
assumption. The 2nd and 4th terms on the right go to 0 uniformly in $t$, for
fixed $a$, as $\delta \to 0$. The 3rd term goes to 0 as $t \to \infty$ for fixed $a$, $\delta$.
Thus, by first choosing $a$ sufficiently large, then fixing $a$ and choosing $\delta$
sufficiently small, and then fixing $a$, $\delta$ and choosing $t$ sufficiently large,
we can make the right side smaller than any preassigned $\varepsilon > 0$.

The necessity of uniform integrability follows from an argument in Loève [3],
p. 183.
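
The limiting age distribution $F_e(a) = \int_0^a (1 - F(x))\,dx / EX_1$ used in the proof can be checked numerically. The sketch below compares an empirical estimate of $\Pr(Z(t) \le a)$ with $F_e(a)$; the uniform$(0,2)$ interarrival law and all function names are assumptions made for illustration only.

```python
import random


def age_at(t, rng):
    """Age Z(t) of a renewal process started with a renewal at time 0.

    Illustrative assumption (not from the paper): X_i ~ Uniform(0, 2).
    """
    last_renewal = 0.0
    while True:
        x = rng.uniform(0.0, 2.0)
        if last_renewal + x > t:
            return t - last_renewal
        last_renewal += x


def empirical_age_cdf(a, t, replications, seed=0):
    """Fraction of independent replications with Z(t) <= a."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(replications) if age_at(t, rng) <= a)
    return hits / replications


def limiting_age_cdf(a):
    """F_e(a) for Uniform(0, 2): integral of (1 - x/2) over [0, a], with EX_1 = 1."""
    a = min(max(a, 0.0), 2.0)
    return a - a * a / 4.0


if __name__ == "__main__":
    print(empirical_age_cdf(1.0, 20.0, 20000, seed=1), limiting_age_cdf(1.0))
```

For this choice $F_e(1) = 3/4$, and the empirical value at $t = 20$ should be close to it.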


Let $\{V(t), t \ge 0\}$ be a regenerative process [5], p. 256, with imbedded
generalized renewal sequence $\{X_i, i \ge 0\}$. By generalized renewal sequence we
mean that $X_0$ is independent of the i.i.d. sequence $\{X_i, i \ge 1\}$ but may have a
different distribution. The random elements $V(t)$ take values in an abstract
measurable space $(\mathcal{F}, \mathcal{A})$. If $F$, the distribution of $X_1$, belongs to $G$ and
if $\mu_1 = EX_1 < \infty$, then it follows from Smith [5], p. 259, that:

$$\Pr(V_t \in A) \to \frac{1}{\mu_1} \int_0^\infty \Pr(V_t \in A, X_0 > t \mid \text{renewal at } 0)\,dt \tag{4}$$

for all $A \in \mathcal{A}$.


It follows from Fubini's theorem that the limit above, denoted $\nu_\infty$, is a probability measure on $(\mathcal{F}, \mathcal{A})$.
If in addition $V$ is a real valued process with a measurable modification, then
it follows from Smith [5], p. 262, that $\frac{1}{t}\int_0^t V(s)\,ds$ converges a.s. and in
expectation to $K_1/\mu_1$, where $K_1$ is the expected value of the integral of $V$
over a regeneration cycle (assuming $K_1$ exists). A
natural question to pose is whether or not $K_1/\mu_1 = E(V(\infty)) = \int x\,d\nu_\infty(x)$.

We will show that this is the case.
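
A simple instance of this question takes $V(t) = Z(t)$, the age of a renewal process, which regenerates at each renewal. Over a cycle of length $X_1$ the integral of the age is $X_1^2/2$, so $K_1/\mu_1 = E[X_1^2]/(2EX_1)$, which is also the mean of the limiting age distribution. The sketch below computes the time average of the age along one path; the uniform$(0,2)$ interarrival law and the function name are illustrative assumptions, not taken from the paper.

```python
import random


def time_average_of_age(horizon, seed=0):
    """Return (1/horizon) * integral of the age Z(s) over [0, horizon].

    Illustrative assumption (not from the paper): X_i ~ Uniform(0, 2),
    for which E[X_1**2] / (2 * EX_1) = (4/3) / 2 = 2/3.
    """
    rng = random.Random(seed)
    clock = 0.0      # time of the most recent renewal
    integral = 0.0   # accumulated integral of the age
    while True:
        x = rng.uniform(0.0, 2.0)
        if clock + x >= horizon:
            # Partial cycle: the age grows linearly from 0 to horizon - clock.
            partial = horizon - clock
            integral += partial * partial / 2.0
            return integral / horizon
        # Full cycle: the age grows linearly from 0 to x.
        integral += x * x / 2.0
        clock += x


if __name__ == "__main__":
    print(time_average_of_age(20000.0, seed=1))
```

Under these assumptions the time average should be close to $2/3$, in agreement with both $K_1/\mu_1$ and the mean of the limiting age.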

It will be convenient to convert the imbedded renewal process $\{X_i, i \ge 0\}$
into a stationary renewal process. This can be done by inserting a renewal to
the left of 0, its distance from 0 having the same distribution as the limiting
distribution of $Z(t)$, the age of the renewal process at time $t$, discussed in
Section 1. Formally, we let $\{X'_i, i = 0, \pm 1, \ldots\}$ be a doubly infinite sequence of
i.i.d. random variables, distributed as $X_1$. Let $T_n = X'_0 + \sum_{i=1}^{n} X'_i$ for
$n \ge 0$, $T_n = X'_0 - \sum_{i=1}^{-n} X'_{-i}$ for $n < 0$. Then $\{T_n, n = 0, \pm 1, \ldots\}$ generates a
strictly stationary renewal process on $(-\infty, \infty)$ (see [1], p. 162).
Start the regenerative process $V$ at the first point to the left of
0. Call this point $T^0$ and call the resulting regenerative process $V'$. Now

$$\Pr(V'_0 \in A) = E[\Pr(V'_0 \in A \mid T^0)] = \frac{1}{\mu_1} \int_0^\infty \Pr(V'_0 \in A \mid T^0 = -t)\,(1 - F(t))\,dt . \tag{5}$$


But

$$\Pr(V_t \in A, X_0 > t \mid \text{renewal at } 0) = \Pr(V_t \in A \mid X_0 > t, \text{renewal at } 0) \cdot \Pr(X_0 > t \mid \text{renewal at } 0) = (1 - F(t)) \Pr(V_t \in A \mid Z(t) = t) = (1 - F(t)) \Pr(V'_0 \in A \mid T^0 = -t) . \tag{6}$$

Thus, from (4), (5), (6)

$$\Pr(V'_0 \in A) = \Pr(V(\infty) \in A) = \nu_\infty(A) . \tag{7}$$

Assume that $V$ has a measurable modification and that $E|V'_0| < \infty$. This
implies that $Y(t) = \int_0^t V'(s)\,ds$ exists a.s. for all $t$. Start $V$ with a renewal
at time 0 (thus $X_0$ has the same distribution as $X_1$) and call the resulting process
$V''$. Define:

$$W(t) = \begin{cases} V''(t), & t < X_0, \\ 0, & t \ge X_0. \end{cases}$$

Then $\int_0^{X_0} |V''(s)|\,ds = \int_0^\infty |W(s)|\,ds$, possibly infinite. Now
$$\to \frac{K_1}{\mu_1} = EV'_0 = EV_\infty .$$


Comments:


1. Regenerative reward processes (real-valued regenerative processes)
arise frequently in queuing theory. They are often of the form $V(t) = W(S(t))$,
where $W$ is a real-valued function and $S(t)$ an abstract valued regenerative
process. For example, in an M/G/s queue with $\lambda\mu_G < s$, the imbedded renewal
sequence consists of epochs at which busy periods begin (the interarrival times
satisfy $\mu_1 < \infty$, $F \in G$), $S(t)$ consists of the number of customers in service at
time $t$ with their arrival times, and the number of customers in the queue, and
$W(S(t))$ may be the number of customers in service, or the number in the queue,
or the unit cost of the service system for handling the number of customers present,
or an indicator variable

$$W(S(t)) = \begin{cases} 1 & \text{if number in queue} = k, \\ 0 & \text{otherwise,} \end{cases}$$

etc.


Assume that $S$ is a regenerative process with arbitrary state space
$(\mathcal{F}, \mathcal{A})$ and jointly measurable as a map from $(\Omega, \mathcal{C}) \times (R, \mathcal{B})$ to $(\mathcal{F}, \mathcal{A})$. Here
$(\Omega, \mathcal{C}, P)$ is the probability space on which each $S(t)$ is defined, $R$ the real
line and $\mathcal{B}$ the Borel sets. If $W$ is a Borel measurable real-valued function
then $\{V(t) = W(S(t)), t \ge 0\}$ is a real-valued measurable regenerative process.

If $\mu_1 < \infty$, $F \in G$, then since $\Pr(S(t) \in A) \to \Pr(S(\infty) \in A)$ for all $A \in \mathcal{A}$, it
follows that $\Pr(W(S(t)) \in B) \to (\nu_\infty W^{-1})(B) = \Pr(W(S(\infty)) \in B)$ for all Borel sets $B$.
Thus, if $E|W(S(\infty))| < \infty$ then it follows from Theorem 3 that:

$$\frac{1}{t} \int_0^t W(S(x))\,dx \to E(W(S(\infty))) \tag{8}$$


a.s. and in expectation. Note also that if $E\left|\frac{1}{t}\int_0^t W(S(x))\,dx\right|^{p_0} \to E|W(S(\infty))|^{p_0}$
for some $p_0 > 1$, then $\frac{1}{t}\int_0^t W(S(x))\,dx \to E(W(S(\infty)))$ in $L^p$, for
$0 < p < p_0$.


2. If $EV(t)$ converges then $EV(\infty)$ must be its limit, since $EV(\infty) =
\lim_{t\to\infty} \frac{1}{t}\int_0^t E(V(s))\,ds$. In this case Theorem 4 is trivial. However, $E(V(t))$
may not converge and Theorem 4 may still hold. For example, start with a renewal
at time 0 and let the interarrival time c.d.f. $F \in G$ have an atom at 1.
Choose a regenerative process $V$ so that $E(V(t) \mid Z(t) = 1/2) = \infty$,
$E(V(t) \mid Z(t) \ne 1/2) = 0$. Then clearly $EV(n + 1/2) = \infty$ for all integers $n$, but
$EV(\infty) = 0$. A necessary and sufficient condition for convergence of $EV(t)$ to
$EV(\infty)$ is uniform integrability of $g(s) = E(V(t) \mid Z(t) = s)$ with respect to
the family $\{F_t, t \ge 0\}$, discussed in Section 1.




3. Also note that if $F \notin G$ but Smith's alternative conditions [5], p. 259
hold, so that $\Pr(V_t \in B) \to \Pr(V_\infty \in B)$ for all Borel sets, then Theorem 5 still
applies. If $V(t)$ does not have a limiting distribution, it still holds that if
$\mu_1 < \infty$, $V$ has a measurable modification and $E|V'(0)| < \infty$, then
$\frac{1}{t}\int_0^t V(s)\,ds \to K_1/\mu_1 = EV'(0)$, a.s. and in expectation.